How to Monitor and Optimize Batched Deletion Processes in SQL Server

Batched deletions are a common strategy in SQL Server to manage large datasets without overloading the system, but poorly tuned deletes can cause blocking, long-running transactions, and heavy log usage. Learn how to monitor and optimize these processes for smooth, efficient database performance.

In previous articles I showed patterns for working with large amounts of data on big tables while keeping locking at a minimum. These processes can allow migrations and maintenance without requiring downtime but, in environments with unpredictable database workloads, there is a risk of heavy traffic starting at any time and disrupting a once smooth operation. In this article, I’ll demonstrate how to augment these processes to allow dynamic adjustment of the configuration.

For most systems, the main limitation these techniques run into is the speed and throughput of I/O (input/output). During periods of low traffic, a large batch size may perform great with no impact to production, but as traffic increases, the storage subsystem may not be able to keep up.

I’ll show two workarounds to deal with this issue: lowering the batch size, and introducing a small delay in between batches. Both will allow the process to continue running with less I/O demand but, if this still isn’t enough, we can easily stop the process and restart it at a different time.

Logging Tables in SQL Server

Before we get into the configuration options, let’s consider how we can get feedback on our process while it’s running, so we’re better informed about the adjustments we want to make. Let’s review the code for purging in batches explored in this article:
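In outline, that pattern stages the keys of the rows to purge into a temp table and then removes them from the base table in batches. A minimal sketch looks something like this; dbo.EventLog, its EventId key, and the 12-month cutoff are placeholders rather than the exact code from that article:

DECLARE @BatchSize INT = 5000;

-- Stage the keys of the rows we intend to purge
SELECT EventId
INTO   #ToProcess
FROM   dbo.EventLog
WHERE  CreatedDate < DATEADD(MONTH, -12, GETUTCDATE());

CREATE TABLE #Batch (EventId BIGINT NOT NULL PRIMARY KEY);

WHILE EXISTS (SELECT 1 FROM #ToProcess)
BEGIN
    -- Pull one batch of keys out of the staging table
    DELETE TOP (@BatchSize) FROM #ToProcess
    OUTPUT DELETED.EventId INTO #Batch (EventId);

    -- Delete the matching rows from the base table
    DELETE e
    FROM   dbo.EventLog AS e
           INNER JOIN #Batch AS b ON b.EventId = e.EventId;

    TRUNCATE TABLE #Batch;
END;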

Now, we’ll add a new table to hold our logging:
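A table along these lines works; the exact column names (and the optional RowsDeleted column) are my own choices rather than a fixed requirement:

CREATE TABLE dbo.PurgeLogging
(
    LogId        INT IDENTITY(1,1) NOT NULL
        CONSTRAINT PK_PurgeLogging PRIMARY KEY,
    StartTime    DATETIME2(3)   NOT NULL,
    EndTime      DATETIME2(3)   NULL,
    BatchSize    INT            NOT NULL,
    DelaySeconds INT            NOT NULL,
    RowsDeleted  INT            NULL,
    ErrorNumber  INT            NULL,
    ErrorMessage NVARCHAR(4000) NULL
);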

This table will contain a row for every iteration of the loop, tracking the time it takes for the DELETE statement to run (because that’s what will alert us to any blocking). Depending on the latency demands of your system, even a 1-2 second iteration may be too slow, while other systems may function without issue as long as each iteration stays below the default client command timeout of 30 seconds.

Besides the time tracking, we’ll also record the values of our configuration parameters and any errors that come up during the operation. We do this by inserting a record at the beginning of our loop:
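For example, assuming the PurgeLogging columns above and an @LogId variable to remember which row to update later:

-- Record the start of this iteration along with the current settings
INSERT INTO dbo.PurgeLogging (StartTime, BatchSize, DelaySeconds)
VALUES (SYSDATETIME(), @BatchSize, @DelaySeconds);

SET @LogId = SCOPE_IDENTITY();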

And by updating that record at the end of our loop:
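Something like this, filling in the end time, the rows affected, and any error details:

-- Close out the log row for this iteration
UPDATE dbo.PurgeLogging
SET    EndTime      = SYSDATETIME(),
       RowsDeleted  = @RowsDeleted,
       ErrorNumber  = @ErrorNumber,
       ErrorMessage = @ErrorMessage
WHERE  LogId = @LogId;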

Configuration Tables in SQL Server

In our previous script we used a static variable to hold the BatchSize, but we’ll now store this in a table along with other parameters to adjust the process.

This allows us to make changes on the fly without stopping the script and rolling back a midflight transaction. It also opens up the possibility of using an automated process to tweak these parameters based on system load.

Let’s look at our configuration table:
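A minimal version could look like this, using a constrained ConfigId column to guarantee the single row; again, the names are illustrative:

CREATE TABLE dbo.PurgeConfig
(
    ConfigId     INT NOT NULL
        CONSTRAINT PK_PurgeConfig PRIMARY KEY
        CONSTRAINT CK_PurgeConfig_OneRow CHECK (ConfigId = 1),
    BatchSize    INT NOT NULL,
    DelaySeconds INT NOT NULL,
    Stop         BIT NOT NULL
);

-- Seed the single configuration row with starting values
INSERT INTO dbo.PurgeConfig (ConfigId, BatchSize, DelaySeconds, Stop)
VALUES (1, 5000, 0, 0);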

This table contains our parameters as well as a check constraint to ensure we only ever have one row stored. We’ll refresh the running variables at the end of each iteration of the loop (as well as once at the beginning of the script):
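In sketch form, reading the configuration into the running variables is a single SELECT:

-- Pick up any configuration changes made while the loop is running
SELECT @BatchSize    = BatchSize,
       @DelaySeconds = DelaySeconds,
       @Stop         = Stop
FROM   dbo.PurgeConfig;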

Error Handling in SQL Server

While we are making improvements to the script, let’s add some polish by introducing error handling inside the main loop. This can be as simple as adding a TRY/CATCH block and some variables to store the error message and number.
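Sketched out, with @ErrorNumber and @ErrorMessage as the variables that feed the logging update, that part of the loop might look like this:

BEGIN TRY
    -- the batch DELETE statements go here, for example:
    DELETE TOP (@BatchSize) FROM #ToProcess
    OUTPUT DELETED.EventId INTO #Batch (EventId);

    DELETE e
    FROM   dbo.EventLog AS e
           INNER JOIN #Batch AS b ON b.EventId = e.EventId;
END TRY
BEGIN CATCH
    -- Capture the error details so they end up in the logging table
    SELECT @ErrorNumber  = ERROR_NUMBER(),
           @ErrorMessage = ERROR_MESSAGE();
    -- SET @Stop = 1;  -- uncomment to stop the script on any error
END CATCH;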

You might also opt to set @Stop = 1 in this section if you want the script to stop any time it hits an error. In this example, I’m letting it continue because I can address the error and rerun the script later; any records that failed to delete will be scooped up by the insert to the #ToProcess table on the next run.

Putting It All Together

Now let’s look at what our script looks like with these new tables:
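Here is one way the pieces fit together. Treat it as a sketch rather than a drop-in script: dbo.EventLog, the 12-month cutoff, and the RowsDeleted column are the same placeholders used in the earlier snippets.

SET NOCOUNT ON;

DECLARE @BatchSize    INT,
        @DelaySeconds INT,
        @Stop         BIT,
        @Delay        CHAR(8),
        @LogId        INT,
        @RowsDeleted  INT,
        @ErrorNumber  INT,
        @ErrorMessage NVARCHAR(4000);

-- Read the starting configuration
SELECT @BatchSize    = BatchSize,
       @DelaySeconds = DelaySeconds,
       @Stop         = Stop
FROM   dbo.PurgeConfig;

-- Stage the keys of the rows to purge (placeholder table and cutoff)
SELECT EventId
INTO   #ToProcess
FROM   dbo.EventLog
WHERE  CreatedDate < DATEADD(MONTH, -12, GETUTCDATE());

CREATE TABLE #Batch (EventId BIGINT NOT NULL PRIMARY KEY);

WHILE @Stop = 0 AND EXISTS (SELECT 1 FROM #ToProcess)
BEGIN
    SELECT @RowsDeleted = NULL, @ErrorNumber = NULL, @ErrorMessage = NULL;

    -- Log the start of this iteration along with the current settings
    INSERT INTO dbo.PurgeLogging (StartTime, BatchSize, DelaySeconds)
    VALUES (SYSDATETIME(), @BatchSize, @DelaySeconds);
    SET @LogId = SCOPE_IDENTITY();

    BEGIN TRY
        TRUNCATE TABLE #Batch;

        -- Pull one batch of keys out of the staging table
        DELETE TOP (@BatchSize) FROM #ToProcess
        OUTPUT DELETED.EventId INTO #Batch (EventId);

        -- Delete the matching rows from the base table
        DELETE e
        FROM   dbo.EventLog AS e
               INNER JOIN #Batch AS b ON b.EventId = e.EventId;

        SET @RowsDeleted = @@ROWCOUNT;
    END TRY
    BEGIN CATCH
        SELECT @ErrorNumber  = ERROR_NUMBER(),
               @ErrorMessage = ERROR_MESSAGE();
    END CATCH;

    -- Close out the log row for this iteration
    UPDATE dbo.PurgeLogging
    SET    EndTime      = SYSDATETIME(),
           RowsDeleted  = @RowsDeleted,
           ErrorNumber  = @ErrorNumber,
           ErrorMessage = @ErrorMessage
    WHERE  LogId = @LogId;

    -- Pick up any configuration changes made while the loop was running
    SELECT @BatchSize    = BatchSize,
           @DelaySeconds = DelaySeconds,
           @Stop         = Stop
    FROM   dbo.PurgeConfig;

    -- Optional pause between batches to ease pressure on the storage subsystem
    IF @DelaySeconds > 0
    BEGIN
        SET @Delay = CONVERT(CHAR(8), DATEADD(SECOND, @DelaySeconds, 0), 108);
        WAITFOR DELAY @Delay;
    END;
END;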

After we kick off this script we can monitor the progress by querying the PurgeLogging table:
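A query along these lines shows the most recent iterations and how long each one took (DurationMs is computed in the query, not stored in the table):

SELECT TOP (20)
       LogId,
       StartTime,
       EndTime,
       DATEDIFF(MILLISECOND, StartTime, EndTime) AS DurationMs,
       BatchSize,
       DelaySeconds,
       RowsDeleted,
       ErrorNumber,
       ErrorMessage
FROM   dbo.PurgeLogging
ORDER BY LogId DESC;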

This results in each batch taking less than 200 milliseconds to complete:

[Screenshot: PurgeLogging output showing each batch completing in less than 200 milliseconds]

Then, if we want to change the batch size or add a delay in between batches, we can update the PurgeConfig table like so:
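For example, to add a one-second pause between batches and raise the batch size at the same time (the values here are only illustrative):

UPDATE dbo.PurgeConfig
SET    BatchSize    = 20000,  -- larger batch, illustrative value
       DelaySeconds = 1;      -- one-second pause between batches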

We can see in our logging table that the change takes effect seamlessly:

[Screenshot: PurgeLogging output showing the configuration change taking effect]

Notice that there’s now a one-second gap between each row’s StartTime and the previous row’s EndTime, which could help slower disks keep up.

We can also see that each batch is now taking more than a second – if we feel this is too long, we can update our PurgeConfig table once again to lower the batch size:
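Again, the value shown is only an example:

UPDATE dbo.PurgeConfig
SET    BatchSize = 5000;  -- smaller batch to shorten each DELETE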

[Screenshot: updating the PurgeConfig table to lower the batch size]

Conclusion

Adding logging and dynamic configuration to this technique allows us to tune the process to the unique capabilities and requirements of any environment. By looking at the time each loop iteration takes to execute, we can adjust the batch size to keep our impact at an acceptable level. If our I/O is saturated, we can add a delay in between batches to allow other processes to complete.

This technique can be used to purge old data as I have shown here, but it can also be used for more advanced processes like changing the datatype on a column while a table is being used, or deleting from multiple tables with foreign key relationships.

About the author

Matt Gantz

Matt is a Database Administrator with over 15 years of experience working with SQL Server. He enjoys teaching, playing music, and climbing the snowy peaks of the Pacific Northwest.